Abstract (Document Summary)

The milk of livestock can be modified dramatically by introducing foreign DNA into the germline. Exclusive 
expression of DNA is ensured by the presence of regulatory sequences from mammary gland-specific genes. 
Production of proteins in the milk of transgenic livestock: problems, solutions, and successes(1,2)
INTRODUCTION 

Genetic engineering has made it feasible to clone the protein-encoding genes from any organism. These 
recombinant genes can be expressed in a variety of systems ranging from simple microbial systems such as 
bacteria, yeast, and cultured mammalian cells to complex eukaryotic systems such as transgenic plants and 
animals. When the primary objective is to produce and purify a specific protein or group of proteins, the choice of 
system is dictated by many factors, including cost of production, amount of protein required, intended use, length of 
time required for development, and nature of the target proteins. As I examine below, many eukaryotic secretory 
proteins that are complex or are required in large amounts are best made in the milk of transgenic animals. To 
date, most transgenic proteins that have been expressed in milk normally originate in tissues other than mammary 
gland. However, the most natural use of this technology must include modifying the protein content of milk by 
manipulating the milk-protein genes themselves. I explore ideas for such a program that involve over-expression of 
alpha-lactalbumin. 

Finally, it is axiomatic that greater knowledge of a biological system facilitates designing experiments to manipulate 
the system for useful purposes. alpha-Lactalbumin has two functions: in several species, including humans, it is an
important nutritional component of milk and in most species it is responsible for the synthesis of lactose, which has 
long been implicated as the principal osmotic factor in milk formation (1). In the last section of this article I discuss 
the effect on milk formation in mice of removing the murine alpha-lactalbumin gene and replacing it with its human 
homologue. 

TRANSGENIC TECHNOLOGY: FUNDAMENTAL ISSUES 
Choice of protein 

Most of the proteins that are targets of the biotechnology and food industries are secretory in nature and many of 
them assume a highly convoluted but reproducible conformation as they pass through the secretory pathway. In
eukaryotic cells, many of these proteins undergo complex posttranslational modifications, leading to alterations that 
may be essential for functionality. It has long been known that prokaryotic systems like bacteria cannot perform 
most of these modifications, and that even the folding of a protein (eg, of human serum albumin) can be subtly 
different when the protein is made in bacteria. Even though yeast and higher plants can make many of the 
modifications, they are restricted in the scope of the modifications conferred; this restriction can prove 
unacceptable, particularly when the protein is intended for therapeutic use. A choice must be made between 
mammalian cell culture, with its proven track record of producing safe and effective therapeutic products, and 
transgenic animal technology leading to milk production, which is clearly superior in production yields and costs, but 
has yet to clear regulatory hurdles and may have some disadvantages in the length of time required for 
development and in consumer acceptance. 

Choice of transgenic species 

The generation times and relevant developmental events for the major domestic livestock as well as for mice and 
rabbits are shown in Table 1. (All tables omitted) A significant feature is the time to the birth of the G sub 2 and G
sub 3 animals, because these times correspond to the start of lactation in what could be the production animals. 
The start of lactation of the G sub 0 females is also a salient feature of these generation times because traditionally this
is the earliest guide to the potential productivity of the animals and their future progeny. However, we recently 
obtained such predictive information much earlier in sheep by using hormonally induced, premature lactation. 
Nonetheless, it is clear from Table 1 that a considerable latent period exists between the beginning of a transgenic 
program in livestock and first milk, a feature made less desirable by our present inability to custom-design 
expression levels. Because of the unavoidable delays associated with livestock production, it is routine for 
transgenic programs to begin with a pilot study in mice. The pilot study can furnish valuable information on the
adequacy of the design of the DNA construct and the ability of the mammary gland to produce proteins of the 
desired quality. The murine model, however, is not totally reliable because both protein yields and protein quality 
are greater from larger animals such as sheep (2) and pigs (3). The length of time to milk production is obviously a 
major factor in the choice of species. However, other considerations apply, such as the disease status of the 
animals, litter size, and volume of milk. The relative merits of each species according to a variety of criteria are 
shown in Table 2. 

Choice of DNA construct 

The most common method of making transgenic mammals is to inject DNA that encodes the desired protein into 
the pronucleus of a fertilized embryo. The embryo is transferred to a foster mother, where it continues its 
development to birth. Expression of the DNA in the target tissue is effected by fusing the region of DNA that 
encodes the protein to a regulatory DNA sequence that is known to be active in that tissue. The first transgenic 
mammals were made with intronless viral or mammalian DNAs (4, 5). These DNA constructs contained 
inappropriate or truncated regulatory sequences that did not behave in a tissue-specific way. Milk-specific 
expression is now effected by fusing the target gene downstream of the regulatory sequence obtained from any of 
a variety of milk-protein genes, including genes encoding murine whey acidic protein (WAP), ovine
beta-lactoglobulin, bovine alpha-lactalbumin, bovine alpha sub s1 -casein, and caprine beta-casein (6). Although WAP-driven
genes have proved problematic in pigs (7) because of premature termination of lactation, genes driven by other 
milk-protein regulatory sequences are approximately equivalent. However, the nature of the target gene 
configuration does seem important: a variety of experiments in mice have shown that genomic DNA is almost 
always superior to complementary DNA (cDNA) (Table 3). This trend was clearly shown in sheep for the production 
of alpha sub 1 -antitrypsin (AAT) (2, 12, 15; Table 4).

Even with the availability of genomic DNA, problems still arise because of the random nature of the chromosomal 
integration of the foreign DNA. This randomness leads to unpredictable, and often disappointing, levels of 
expression. The discovery of autonomously acting and tissue-specific locus control regions (LCRs) that confer copy 
number-dependent expression on the target gene has led to one remedy (22). Unfortunately, no mammary gland- 
specific LCRs have yet been isolated. Another remedy might be to use very large pieces of DNA (>100 kb) in the 
form of yeast artificial chromosomes (YACs) because, in principle, these might contain the requisite LCRs even if 
they have not been identified formally. Transgenic animals have been made with YACs and their expression 
phenotypes resemble those expected for genes that are accompanied by LCRs (23). 

Choice of founder 

Founder transgenic animals usually have one or more copies of the transgene inserted into a locus on a single 
chromosome. These animals are therefore heterozygous for the transgene (strictly speaking, hemizygous, because
there is no corresponding allele on the sister chromosome). Because the insertion process is random, every 
founder has a different genotype. As a result, expression levels in different founder animals that contain the same 
gene construct can vary widely (12). In addition, not all founders pass on any or all of their complement of 
transgenes intact (3, 12). The reasons for this are not completely understood, but the frequent integration of several 
copies of a gene at a single locus might stimulate recombination events at that locus. Furthermore, many founder 
animals are mosaic (not all cells or tissues are transgenic); a germ tissue that is nontransgenic could account for 
the lack of transmission. Because stability of inheritance and protein expression are imperative to any long-term 
production strategy, several founders have to be maintained and evaluated simultaneously. This situation could 
probably be resolved with the availability of livestock embryonic stem cells, as discussed in another article in this 
supplement (24). 

Purification and validation 

The criticism that followed early attempts at attracting commercial interest in the use of transgenic milk was that 
specific proteins in milk are difficult to purify to pharmaceutical standards at reasonable cost. We showed that these 
early concerns were unfounded because it proved easy to purify products such as human AAT to well greater than 
99% purity. The secret is in removing the lipid, most of which is removed by low-speed centrifugation and the 
remainder by differential precipitation and chromatography. Purification challenges in separating highly homologous 
proteins remain, but these are common to all production systems. 

A second and more pernicious concern is related to animal health. In particular, two issues are specific to the 
production of proteins in animals. First, there is the issue of prions, which are responsible for neuropathic conditions 
like scrapie and bovine spongiform encephalopathy (BSE) in infected animals and are similar to the agent that 
causes Creutzfeldt-Jakob disease in humans. For sheep milk-produced proteins for parenteral administration, we 
adopted the approach of validating the purification process for removing scrapie-producing prions and of using only 
animals of New Zealand origin because that country is free from the disease. In our program of producing novel 
bovine milks for oral use, we believe that validation is not appropriate because oral ingestion of bovine or ovine milk 
has never been associated with transmission of neuropathic conditions to man; however, we do use cows from a 
BSE-free country, the United States. It is interesting to note that prion removal is not validated in the human blood 
fractionation industry despite the possibility of contamination of pooled plasma donations with blood from persons 
with Creutzfeldt-Jakob disease. 

The second issue is that, practically speaking, it is impossible to ensure that production animals retain the same 
disease status from day to day. Certain viruses are endemic to particular livestock species in all parts of the world, 
and subclinical infections could go unnoticed. Our strategy is to identify the viruses of concern and to keep the 
animals virus-free if possible but, failing that, to ensure that purification would remove those viruses even if every 
animal in a production flock or herd were subclinically infected. Such a strategy requires information on the
maximum possible viral level that could occur in the milk of an infected animal. We are beginning to gather this 
information. 

Not surprisingly, regulatory authorities in both the United States and Europe have been drafting guidelines for the 
production of transgenic livestock. The European document was finalized in December 1994, and a US Food and 
Drug Administration (FDA) "Points to Consider" was published in 1995 (25). Both of these documents concentrate 
on the development in transgenic animals of products for parenteral use. Issues of genetic stability and the 
microbiological burden of production animals figure prominently in the European document, and FDA presentations 
indicate similar concerns. Regarding microbiological issues, we would expect less-stringent guidelines in instances 
in which the products are for oral use, because there is no reason to believe that transgenic production animals 
would pose any special hazards. However, genetic stability will be an issue and so will be the production of novel 
immunogenic proteins; the latter is under review in the United States and the United Kingdom regarding transgenic 
foods in general. 

TRANSGENIC TECHNOLOGY IN ACTION 
Overview 

All the published expression levels of protein from milk of transgenic livestock are shown in Table 4. Several 
different proteins of various complexity have now been expressed successfully in concentrations >1 g/L in large 
animals. This contrasts with the generally much lower concentrations in mammalian cell culture. More complex 
proteins have now been elaborated in murine mammary glands. For example, fibrinogen is composed of six
polypeptides: two alpha, two beta, and two gamma chains. These are normally assembled in liver cells into a
hexamer. The secreted hexamer is cleaved in the blood by thrombin to form fibrin. Human fibrinogen secreted in 
the milk of transgenic mice in concentrations >2 g/L (Table 4) is thus far indistinguishable from protein derived from 
human plasma. Again, mammalian cell culture has proved poor by comparison. 

Production of human AAT in sheep milk 

We reported previously the generation of five founder sheep containing the human AAT gene (2). alpha-Antitrypsin 
is a single-chain glycoprotein consisting of 384 amino acids that is secreted mainly from liver. Its main role is 
thought to be the inhibition of neutrophil-derived elastase. A common congenital deficiency of active AAT is 
responsible for the development of emphysema; however, because uncontrolled tissue degradation by elastase is 
also associated with the progression of other lung diseases like cystic fibrosis and adult respiratory distress 
syndrome, the provision of large quantities of AAT is likely to be medically useful. Unfortunately, before the sheep 
milk source mentioned above was available, the only source of significant amounts of therapeutically useful AAT 
was human plasma donations, and these cannot meet expected needs. Human AAT made in bacteria and yeast 
proved unsuitable and mammalian cell culture could not meet the demands for large amounts of the protein at 
reasonable costs. 

The founder females produce from 1 to 35 g AAT/L milk, and these amounts have been sustained through each of 
three lactations. Unfortunately, the transgenic locus in one of these animals, the highest-producing one, was not 
stable and her progeny all inherited different AAT copy numbers. However, the genetic locus in one of the founder 
males has been shown to be stable over four generations, and to date female progeny from three of these 
generations have produced from 13 to 18 g AAT/L consistently (12). We are now developing a production flock 
from a set of G sub 2 half-sisters that were all produced with the semen of the same G sub 1 male of the line. It is 
hoped that the AAT from these animals will enter clinical trials in 1996. The data gathered in this AAT program 
provide the most compelling demonstration of this technology. They also address a fundamental aspect of milk 
composition: we have now analyzed the milk of several different animals that produce from 35 to 47 g AAT/L. This 
production does not seem to be at the expense of other milk proteins, which continue to be produced in normal 
amounts (approximately 50 g/L), a situation that differs from that seen in "overproducing" transgenic mice (26). We may expect a
similar situation in cows, although in the elite production animals (animals of the highest pedigree related to milk- 
producing ability) an upper limit of protein concentration may be imposed by saturation of upstream processes such 
as food digestion and metabolic interconversion. Nevertheless, it seems that this technology has the potential to 
radically remodel the protein content of milk. 

Modification of milk-protein content in cows 

Clark (24) commented on the possible benefits that could accrue if milk could be custom-modified in several ways. 
So far there is only one example of the remodeling of bovine milk: the generation of animals that contain the gene 
for human lactoferrin (27). I discuss some future nutritional targets of interest in more detail. 

alpha-Lactalbumin is a small (123 amino acids), single-chain milk protein that is present in milk whey. Its 
prominence varies by species: in humans it is the major whey protein and in cows it is a minor whey component. 
Because of its well-balanced amino acid composition, a case could be made for increasing the amount of alpha- 
lactalbumin in infant formulas at the expense of beta-lactoglobulin (not present in human milk), which has a less- 
suitable amino acid composition for human infants and the presence of which in bovine-derived milk products may 
contribute to allergy problems in the young. Site-directed mutagenesis is another valuable genetic engineering 
technique that now makes it possible to custom-design alterations of amino acids of any protein for which a gene 
and a suitable expression system are available. This technique can be applied to the design of special foods with 
therapeutic properties, one example being the production of novel proteins for patients with phenylketonuria (PKU). 

PKU is a congenital disease that results from the absence of the enzyme that metabolizes phenylalanine. Patients 
with PKU require a diet low in phenylalanine. This is particularly so for infants and pregnant women. Formulas low 
in phenylalanine are available but they are unappetizing and compliance is a problem, especially after infancy. A 
remedy might be available that exploits our ability to modify milk by transgenic means and the knowledge that 
alpha-lactalbumin contains only four phenylalanine residues. The position of each of these residues can be
determined in the three-dimensional protein structure; therefore, site-directed mutagenesis with the goal of 
substituting other amino acids for the phenylalanine residues without disrupting the protein structure can be 
envisioned. The modified protein could be expressed in milk and then purified to provide the base for an improved 
diet for patients with PKU. Obviously, such purification would be helped if the endogenous alpha-lactalbumin content
could be abrogated by either gene deletion or antisense technology.
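
The substitution step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the peptide below is a made-up toy sequence, not alpha-lactalbumin, the function names are hypothetical, and the replacement residue is an arbitrary placeholder. In practice each replacement would be chosen from the three-dimensional structure so that folding is not disrupted.

```python
# Toy illustration only. TOY_SEQUENCE is a made-up peptide, NOT the
# alpha-lactalbumin sequence; it merely contains four phenylalanines (F).
TOY_SEQUENCE = "MKFLAGTFEWCKDFPQLF"


def find_residues(seq, residue="F"):
    """Return the 1-based positions of a residue in a protein sequence."""
    return [i + 1 for i, aa in enumerate(seq) if aa == residue]


def substitute(seq, replacements):
    """Apply point substitutions given as {1-based position: new residue}."""
    chars = list(seq)
    for pos, new_residue in replacements.items():
        chars[pos - 1] = new_residue
    return "".join(chars)


phe_positions = find_residues(TOY_SEQUENCE)

# Placeholder plan: replace every phenylalanine with leucine (L).
# The choice of leucine is arbitrary and carries no structural claim.
plan = {pos: "L" for pos in phe_positions}
mutant = substitute(TOY_SEQUENCE, plan)

print(phe_positions)
print(mutant)
```

Listing the positions first and applying an explicit plan afterwards keeps the design decision (which residue replaces which phenylalanine) separate from the mechanical edit, which is where structural knowledge would enter.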

The generation of transgenic cows is labor-intensive and time-consuming. However, in many ways it is simpler than 
the current methods for other transgenic livestock. The most convenient method involves the in vitro maturation and 
fertilization of bovine oocytes that are removed from the ovaries of slaughtered animals. The fertilized embryos are 
then centrifuged and one of the exposed pronuclei is injected with the chosen DNA. The embryos are cultured for 7 
d in vitro and the surviving embryos (approximately 7%) are transferred to hormonally synchronized recipients. The pregnancy
rate we obtain is approximately 20%.

Published data indicate a transgenic integration rate of 5% (27), so roughly 5 of every 100 live births can be 
expected to be transgenic. At best, one transgenic animal for about every 1500 embryos injected can be expected. 
Fortunately, we are now finding it possible to inject >800 embryos per week. The limiting factor, and also the most
expensive, is the number of appropriate recipients required. However, because no invasive procedure is ever used,
the recipients can be recycled. This contrasts with the present manufacture of other transgenic livestock animals, in 
which surgery is required for both the embryo donors and the recipients, thereby limiting the recycling options. 
Marginal improvements in the process can already be brought about (at great cost) by, for example, the repeated 
use of live cow embryo donors (such embryos develop better). However, significant improvements in the efficiency 
of the transgenic process will be made only when injected embryos can be screened for transgene integration 
before their transfer to recipients, when the integration rate can be improved, or, best of all, when transformable 
embryonic stem cells become available. 
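
The yield estimate quoted above can be checked by multiplying the stage-wise rates given in the text (approximately 7% of injected embryos surviving culture, a pregnancy rate of approximately 20%, and an integration rate of 5%). The short Python sketch below does this. It assumes that the stages are independent and that each pregnancy yields one live birth, neither of which is stated in the text; the function and variable names are illustrative.

```python
def transgenic_yield(embryos_injected,
                     culture_survival=0.07,   # embryos surviving 7 d in vitro
                     pregnancy_rate=0.20,     # pregnancies per transferred embryo
                     integration_rate=0.05):  # transgenic fraction of live births
    """Expected counts at each stage of the pipeline.

    Assumes independent stages and one live birth per pregnancy.
    """
    transferred = embryos_injected * culture_survival
    births = transferred * pregnancy_rate
    transgenic = births * integration_rate
    return {"transferred": transferred, "births": births, "transgenic": transgenic}


stages = transgenic_yield(1500)

# Embryos that must be injected, on average, per transgenic animal born.
embryos_per_transgenic = 1 / (0.07 * 0.20 * 0.05)

# Weeks of injection per transgenic animal at 800 embryos per week.
weeks_per_transgenic = embryos_per_transgenic / 800

print(stages)
print(round(embryos_per_transgenic), round(weeks_per_transgenic, 1))
```

Under these assumptions the product of the three rates is 0.0007, that is, roughly one transgenic animal per 1400 to 1500 injected embryos, consistent with the figure in the text. At 800 embryos per week this corresponds to a little under 2 wk of injections per expected transgenic birth.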

The chronology of a milk-modification program using the present technology is shown in Figure 1. (Figure 1 
omitted) Significant features of the program are that a high-expressing G sub 0 male or female founder makes milk
collection from a modest production herd possible within 5 y. However, for this to occur in a cost-effective way, 
nonroutine procedures such as premature induction of lactation of bulls and transvaginal oocyte recovery from 
cows are required. Large herds of animals can be expected after 6.5 y. Herd development would be expedited by 
the generation of homozygous animals (ie, animals in which the transgenes are present on both sister 
chromosomes) because all female offspring of these animals would be transgenic. However, it would be 9.5 y 
before these herds would be available for milking. 

alpha-Lactalbumin and milk formation 

It has been argued above that in certain situations the transgenic mammary gland is superior to the cultured 
mammalian cell as a protein-production system. However, the expression of added genes in the mammary gland 
may have unwelcome effects on both mammary gland function and the physiology of the organism as a whole. 

alpha-Lactalbumin may exert a strong influence on milk carbohydrate and fluid content through its role as a 
component of the lactose synthase complex. It associates with galactosyltransferase, a Golgi membrane protein, to 
change the substrate specificity of that enzyme and promote lactose synthesis. In many species, lactose is thought 
to be the major osmotic influence in determining milk volume. There is a correlation between lactose, fluid, and
alpha-lactalbumin concentrations in the milks of different species, but no causative relation has yet been
established (28). If milk formation is indeed sensitive to alpha-lactalbumin synthesis, this could affect some of the
programs suggested above. For example, overexpression of native alpha-lactalbumin could lead to extra lactose
formation and a more dilute milk, and an overexpressed, mutated alpha-lactalbumin might be a competitive inhibitor of
lactose synthase and reduce lactose synthesis, resulting in a highly concentrated milk. 

Clearly, these issues could be resolved empirically by performing overexpression experiments and analyzing the 
results. However, a more fundamental understanding of the role of alpha-lactalbumin in milk formation would 
undoubtedly be beneficial to the experimental design. We conducted experiments in which we generated mice with 
no murine alpha-lactalbumin genes and mice in which the murine alpha-lactalbumin genes were replaced by 
human alpha-lactalbumin genes (29). A description of the phenotypes of mice that contained different combinations 
of the alpha-lactalbumin genes from the two species follows; for the original data refer to the study by Stacey et al 
(30). The alpha-lactalbumin concentrations in the milks of the different mouse lines were estimated by direct 
visualization on polyacrylamide gels as well as by quantitative adsorption chromatography with phenyl-sepharose 
columns. A normal (wild-type) mouse is designated "alphalac sup m /alphalac sup m"; this designation refers to the
presence (alphalac sup m ) or absence (alphalac sup - ) of the murine (m) alpha-lactalbumin gene on each sister
chromosome. Wild-type mice (alphalac sup m /alphalac sup m ) have only a small amount of alpha-lactalbumin in
their milk (0.9 g/L) and this concentration is halved in heterozygous mice with only one copy of the gene (alphalac
sup m /alphalac sup - ). Nevertheless, the composition of the two milks is similar and even the small reduction in
lactose in the heterozygotes is not significant. These data indicate that, although synthesis of alpha-lactalbumin is
dependent on gene dose, alpha-lactalbumin concentrations in the cell are not limiting for lactose synthesis.

It is of little surprise that mice with no copies of the alpha-lactalbumin genes (alphalac sup - /alphalac sup - ) make no
alpha-lactalbumin and their milk is radically different from that of the mice discussed above. It is viscous and 
extremely protein- and lipid-rich and contains no lactose. Milk volumes in mice with no copies of the gene are 
dramatically reduced and these animals are unable to rear litters successfully, but all other developmental functions 
appear normal. We conclude that the presence of alpha-lactalbumin is essential to lactose formation and that this in
turn drives milk volume. 

The content of alpha-lactalbumin relative to that of other milk proteins is considerably greater in human milk
(16-25%) than in mouse milk (approximately 0.1%). This difference is maintained when the human gene is substituted for the
mouse gene. The human replacement gene is designated "alphalac sup h ." Humanized (alphalac sup h /alphalac
sup h ) mice secrete approximately 1.4 mg alpha-lactalbumin/L milk. We showed by using heterozygous (alphalac sup
m /alphalac sup h ) mice that this increased secretion is due to a higher rate of transcription that occurs on the 
human gene. Milk from totally humanized mice is similar to that from wild-type mice except that the volumes are 
greater in humanized mice. We conclude that human alpha-lactalbumin can functionally substitute for the mouse
protein in the mouse mammary gland. We speculate that the increased volume is due to increased lactose
formation that results from either the surplus of alpha-lactalbumin protein or improved kinetic properties of the hybrid
lactose synthase. 

CONCLUSIONS 

Transgenic technology can be used to both exploit the impressive productivity of the mammary gland and provide 
information about the molecular mechanisms that underlie that productivity. Large amounts of a desired protein can 
be obtained from the milk of transgenic animals. The remaining major hurdles involve increasing the efficiency of 
generating transgenic animals, successfully addressing regulatory issues, and, particularly in the area of modified 
milk foods, maintaining the low cost of large-scale purification and overcoming public concerns. 

1 From PPL Therapeutics Ltd, Roslin, Edinburgh, United Kingdom.

2 Address reprint requests to A Colman, PPL Therapeutics Ltd, Edinburgh EH25 9PP, United Kingdom. 
The mammary gland as a bioreactor: factors regulating the efficient expression of milk protein-based transgenes(1-3)

INTRODUCTION 

The development of recombinant DNA technology coupled with the techniques of microinjection and embryo 
transfer to introduce foreign genes into the germline of animals has provided the basic tools necessary to target the 
expression of heterologous proteins to any tissue in transgenic animals (1, 2). Applying this technology
successfully, however, requires prior knowledge of the location of crucial regulatory elements that are required for 
tissue-specific expression as well as for appropriate hormonal and developmental regulation of the resulting 
transgene. When applied to the mammary gland, this technology has resulted in mouse models of breast cancer in 
which the functions of oncogenes and tumor-suppressor genes are studied in their appropriate developmental 
context (3, 4). In addition, this approach has made it possible to manipulate the composition of milk in transgenic 
livestock (5) and to use the mammary gland as a bioreactor to overproduce proteins of pharmaceutical importance 
(6, 7).



Our laboratory has been interested in studying the mechanisms by which hormones regulate the expression of 
differentiated function in the normal mammary gland and how these regulatory mechanisms have deviated in breast 
cancer. Almost two decades ago we began studying two rat milk-protein genes that encode beta-casein and whey 
acidic protein (WAP) as molecular markers of mammary epithelial cell terminal differentiation (8, 9). The expression 
of these genes is regulated during mammary gland development by lactogenic hormones and cell-substratum 
interactions (10, 11). Studies in transgenic mice and in transfected mammary epithelial cells have led to the 
identification of the important elements that are required for mammary-specific expression and have provided new 
insights into the mechanism of action of prolactin and glucocorticoids in regulating milk-protein gene expression 
(12, 13). Not all the important regulatory sequences are located in the regions flanking the milk-protein genes, 
however; some are in intragenic, noncoding regions (14, 15). Several general features of transgene architecture 
that involve the locations and sizes of introns and exons are also important considerations in designing constructs 
for targeting heterologous genes by using the milk-protein gene regulatory sequences (16). The application of 
these principles has led to the successful overexpression of insulin-like growth factor I (IGF-I) in the milk of
transgenic mice (17). These studies are summarized in this article. 

GENERAL FEATURES OF TRANSGENE ARCHITECTURE 

The architecture of the transgene that is introduced into the germline of animals by microinjection plays an 
important role in the level of expression of the transgene. DNA that is introduced by microinjection into the
pronucleus is usually inserted randomly into the genome as head-to-tail concatemers. Because the eukaryotic 
genome is organized into topologically constrained domains, random integration can lead to position effects in 
which transgene expression is influenced by the surrounding chromosomal sequences. Thus, in many cases the 
level of transgene expression will vary over several logarithms, depending on the site of integration, and expression 
may be observed in < 50% of the positive transgenic mice. Because of the expense and time required to generate 
transgenic livestock, this presents a major problem. 

Several examples of copy number-dependent, position-independent expression of transgenes, however, were
reported (18) as a result of including dominant control regions, also called locus control regions (LCRs), in the
construct (Figure 1). (All figures omitted) LCRs are usually identified as large regions that are hypersensitive
to nucleases like deoxyribonuclease (DNase) I and may be clustered binding sites for several transcription factors. 
LCRs may act in concert with proximal enhancer and promoter elements that surround specific genes in the locus 
to facilitate their appropriate developmental regulation. 

The casein genes are part of a large multigene locus that encompasses a 250-kb region on chromosome 6 in the 
bovine genome (M Rijnkels et al, 1995, written communication) (19). We suggested previously that LCRs may play 
a role in the coordinate expression of the individual casein genes during mammary gland development. This 
hypothesis was supported by recent studies of the expression of the individual bovine casein genes in transgenic 
mice (M Rijnkels et al, 1995, written communication) (20). Studies are in progress in our laboratory to isolate and 
characterize potential casein LCRs. 

In contrast to the casein genes that exist as a large gene cluster, the gene that encodes WAP appears to be part of 
a single gene locus. Surprisingly, copy number-dependent expression of a small (3.0-kb) rat WAP transgene was 
observed (15). This expression appears to result from the cooperative interactions between several copies of the 
integrated rat WAP transgene, and is not due to an LCR-like element (21). 

Other elements, such as matrix attachment regions (MARs) and boundary elements or insulators, may serve to 
restrict the action of enhancers and LCRs to within the appropriate domain (22). For example, coinjection of a 
chicken lysozyme MAR with the mouse gene for WAP was reported to increase the frequency of transgene 
expression, as well as to result in appropriate hormonal and developmental expression (23). These types of 
regulatory elements may determine the activity of a gene locus by establishing an appropriate chromatin domain 
and facilitating the accessibility to specific transcription factors at appropriate times during development. These 
transacting factors interact with tissue-specific enhancers and promoters that are also required for the appropriate 
expression of transgenes (Figure 1). 

Tissue-specific enhancers have been identified in both the 5' and 3' flanking, as well as intragenic, regions of genes 
(2). Enhancers are also usually characterized as sites of hypersensitivity to nucleases such as DNase I. In many 
cases these enhancers reside within the first few hundred base pairs of 5' flanking DNA, but in some instances they 
may be farther than 10 kb 5' or 3' from the site of transcription initiation. For the rat WAP gene, crucial regulatory 
elements were identified approximately 700-800 bp 5' to the site of transcription initiation (24), and for the rat and 
bovine beta-casein genes these elements reside in the first 200 bp of 5' flanking DNA (13) and 1500-1600 bp 5' to the 
transcription start site, respectively (25). These regulatory elements will be described in detail in the next section. 

In addition to transcriptional regulatory elements, other features of transgene architecture, such as intron placement 
and exon size, play an important role in governing expression. If possible, conserving the genomic structure of the 
heterologous gene, as illustrated in Figure 1, is preferred. This is not always feasible for very large genes, such as 
the one that encodes IGF-I. In addition, important regulatory elements may reside in introns, as in the case of the 
immunoglobulin enhancer (26), and cause ectopic expression of the heterologous gene in tissues other than 
mammary gland. If the expressed transgene encodes a potent hormone or cytokine, ectopic expression could 
adversely affect the health of the transgenic recipient. In situations in which it is not possible to use the entire gene, 
minigene constructs can be used, but it is not always possible to predict what effect removing specific introns will 
have on the expression of the transgene. Several general rules seem to apply, however. First, the presence of an 
intron, even a heterologous intron, located near the 5' end of the gene appears to be advantageous (16). Second, 
because the average size of internal exons in mammalian genes is only 137 nucleotides, inserting large 
complementary DNAs (cDNAs) that may contain putative splice sites should be avoided. Third, the terminal exon 
structure and downstream 3' flanking sequences are important for efficient polyadenylation and should be 
conserved whenever possible. 
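
The three rules above can be read as a design checklist. The following is a minimal sketch of such a checklist, not a validated design tool: the class, field names, and the size threshold `exon_size_factor` are hypothetical and chosen only for illustration. Only the 137-nucleotide average comes from the text.

```python
from dataclasses import dataclass
from typing import List

# Average internal exon size in mammalian genes, as quoted in the text.
AVERAGE_INTERNAL_EXON_NT = 137


@dataclass
class ConstructDesign:
    """Hypothetical summary of a transgene construct."""
    has_5prime_intron: bool
    largest_internal_exon_nt: int
    keeps_terminal_exon_and_3prime_flank: bool


def design_warnings(design: ConstructDesign,
                    exon_size_factor: float = 5.0) -> List[str]:
    """Return one warning per rule that the design appears to break.

    exon_size_factor is an arbitrary illustrative threshold: an internal
    exon more than this many times the average size is flagged.
    """
    warnings = []
    if not design.has_5prime_intron:
        warnings.append("no intron near the 5' end of the gene")
    if design.largest_internal_exon_nt > exon_size_factor * AVERAGE_INTERNAL_EXON_NT:
        warnings.append("large internal exon or cDNA insert; check for cryptic splice sites")
    if not design.keeps_terminal_exon_and_3prime_flank:
        warnings.append("terminal exon or 3' flanking sequence not conserved")
    return warnings


if __name__ == "__main__":
    minigene = ConstructDesign(False, 2000, False)
    for message in design_warnings(minigene):
        print("warning:", message)
```

A construct that passes all three checks is not guaranteed to express well; the checklist only captures the general rules stated here.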

The function of nontranslated as well as of coding regions within the heterologous gene should also be considered 
in the design of the transgene. Thus, for casein- and WAP-based constructs it may be advantageous to include the 
5' and 3' untranslated regions (UTRs), respectively, to facilitate transgene expression (14, 15). These may play an 
important role in the posttranscriptional regulation of transgene expression by influencing mRNA processing, 
stability, and translational efficiency. For example, two ubiquitously expressed single-stranded DNA-RNA binding 
proteins interact with both the highly conserved 5' UTR of the beta-casein mRNA and a negative regulatory region 
in the beta-casein gene promoter (14). In the mammary gland, expression of beta-casein mRNA may lead to 
sequestration of these binding proteins and increased beta-casein gene transcription. The beta-casein 5' UTR 
sequence can also increase the expression of a heterologous reporter gene, indicating that it may play a positive 
role in posttranscriptional regulation of casein gene expression. The 3' UTR of the WAP gene is more highly 
conserved than the coding region of the gene, and appears to be important for copy number-dependent expression 
of the rat WAP transgene (21). This sequence also interacts with ubiquitously expressed proteins, but the function 
of these proteins in RNA processing and gene transcription has not yet been determined. 

Finally, for appropriate secretion, a minimal requirement is the inclusion of a signal peptide sequence in the hybrid 
construct (Figure 1), but this does not always ensure the appropriate vectorial secretion into milk of all proteins. 
Posttranslational modifications and processing of targeted heterologous proteins also play a crucial role in obtaining 
biologically active proteins. Although many of the desired posttranslational modifications occur in the mammary 
gland, it is becoming evident that not all targeted heterologous proteins undergo the appropriate processing (27). 

COMPOSITE RESPONSE ELEMENTS MEDIATE HORMONAL REGULATION AND MAMMARY GLAND-SPECIFIC GENE EXPRESSION 

Milk-protein gene expression is regulated by the synergistic actions of the lactogenic hormones, insulin, 
glucocorticoids, and prolactin, and is critically dependent on cell-substratum interactions. Glucocorticoids appear to 
control WAP and beta-casein gene expression through distinct mechanisms that may entail both direct and indirect 
pathways (28, 29). Thus, glucocorticoid induction of beta-casein gene expression occurs with a significant lag of > 
8 h in mammary cells pretreated with insulin and prolactin and is prevented by the protein synthesis inhibitor 
cycloheximide (30). Conversely, in mammary explant cultures, glucocorticoid induction of WAP gene expression is 
rapid, and, in the presence of insulin and the absence of prolactin, results in a 68-fold increase (31). 

To localize the regulatory regions that are important for the hormonal and tissue-specific expression of the rat WAP 
gene, we mapped DNase I-hypersensitive sites in the 5' flanking region of the rat WAP transgene that showed copy 
number-dependent expression in transgenic mice (24). Two lactating, mammary-specific sites that were 
hypersensitive to DNase I were identified. The region containing the distal site between 830 and 720 bp 5' to the 
transcription start site was shown to be crucial for expression in transgenic mice. Detailed analysis of this region 
has indicated that it contains several binding sites for the transcription factor nuclear factor I (NF-I) (24). This factor 
was originally identified as a factor that is required for adenovirus replication (32, 33) but different isoforms have 
been identified that vary according to the cell type and growth conditions (34). 

Surrounding the NF-I binding sites, several specific glucocorticoid receptor (GR) binding sites were also identified 
with in vitro DNase I footprinting with baculovirus-expressed GR (35). This region conferred dexamethasone 
inducibility to a heterologous reporter gene in transient cotransfection experiments with GR in CV1 cells (35). 
Furthermore, glucocorticoid-induced changes in transgene expression were correlated with the appearance of 
DNase I-hypersensitive sites. Immediately downstream from the GR and NF-I binding sites a consensus 
interferon-gamma binding site similar to that in the beta-casein promoter (36) was identified. These sites appear to 
mediate prolactin responsiveness of the milk-protein genes by tyrosine phosphorylation of a unique member of the 
STAT (signal transducers and activators of transcription) family of transcription factors, designated mammary gland 
factor (MGF) or STAT5 (37). 

Thus, the distal DNase I-hypersensitive region contains a cluster of transcription factor-binding sites. Moreover, 
these sites are highly conserved in mouse and rat genes encoding WAP. To determine the functional importance of 
these sites, we introduced point mutations into the NF-I and STAT5 binding sites and analyzed several 
independent lines of transgenic mice (38). Transgene expression was nullified when the palindromic NF-I site or 
both NF-I binding sites were mutated; mutation of the STAT5 binding site reduced transgene expression by 
approximately 90% per gene copy. These results indicated that the regulation of WAP gene expression is determined by 
cooperative interactions among several transcription factors whose binding sites make up a composite response 
element. 

This led us to propose a model in which the interplay among different transcription factors controls both the 
hormonal induction and the tissue specificity of WAP gene expression (Figure 2). During late pregnancy and 
lactation, glucocorticoid stimulation may result in GR-mediated chromatin structural changes that relieve the 
repressive role of nucleosomes to create an "open window" for the interaction of other nonhistone DNA-binding 
proteins and increase the accessibility of nearby NF-I and STAT5 binding sites. Whether protein-protein (between 
STAT5, GR, and NF-I) as well as protein-DNA interactions are required for transactivation remains to be 
established. This model can account for the dramatic and rapid effect of glucocorticoids on WAP gene expression 
and the requirement for the synergistic interaction with prolactin mediated by the induction of STAT5 tyrosine 
phosphorylation through the JAK-STAT signaling pathway (32, 39). 

A related but distinct mechanism appears to account for the synergistic action of glucocorticoids and prolactin on 
casein gene expression (Figure 3). A region of approximately 80 bp that is important for hormonal regulation was 
defined by transfection studies in the HC11 mammary epithelial cell line and by DNA footprinting experiments. Within this 
region are binding sites for several different transcription factors, as illustrated in Figure 3 (13, 36, 40). This 
composite element is included in the region of the rat beta-casein promoter that is capable of eliciting tissue- 
specific and developmental regulation in transgenic mice (12). 

The binding site for STAT5 that is present 90 bp 5' to the transcription start site was shown to be crucial for 
prolactin regulation of rat beta-casein gene expression (41). Immediately 5' to the STAT5 binding site is the binding 
site for the multifunctional zinc-finger transcription factor yin yang-1 (YY1), which acts as a repressor in the context 
of the beta-casein promoter (36, 42, 43). Mutating the YY1 binding site was shown to facilitate binding of STAT5. 

Farther 5' to the YY1 binding site is a binding site that was reported to interact weakly with STAT5 (13) but that is 
also a consensus binding site for the CCAAT enhancer-binding protein (C/EBP). C/EBPs are members of a family 
of heat-stable transcription factors that contain a leucine zipper and a flanking basic DNA-binding domain. They 
have been proposed to play a major role in regulating differentiation in tissues such as liver and in adipocytes (44). 
We showed recently that recombinant C/EBPbeta can bind at this site, albeit somewhat more weakly than at the 
C/EBP binding site in one of the acute-phase genes. However, interaction at this site in the beta-casein promoter 
during lactation appears to be primarily with C/EBP. The relative amounts of different C/EBP family members and 
translationally regulated isoforms are altered dramatically during mammary gland development. Furthermore, in 
HC11 cells glucocorticoids and extracellular matrix may play major roles in regulating the different C/EBPs and their 
isoforms (45). The steroid hormone-dependent alteration in the amounts of C/EBPbeta isoforms may, therefore, 
account for the delayed response and dependence on protein synthesis of beta-casein gene expression after the 
addition of glucocorticoid. 

Immediately 5' to the C/EBP binding site is a glucocorticoid response element (GRE) half-site that is capable of 
interacting weakly with the GR (40). Thus, our hypothesis is that the addition of glucocorticoid results in a change in 
the relative amounts of certain C/EBPbeta isoforms such that C/EBPalpha can then interact with GR through direct 
protein-protein interaction and with the beta-casein promoter at their suboptimal binding sites through protein-DNA 
interactions. A similar interaction was shown previously for GR and C/EBPs with the acute-phase alpha1-acid 
glycoprotein gene promoter (46). The GR-C/EBP interaction may then help facilitate the dissociation of YY1, thus 
making possible the binding of prolactin-activated STAT5 as discussed previously. This model accounts for both 
the delayed response to glucocorticoids and the synergism between glucocorticoids and prolactin for beta-casein 
induction. 
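
The sequence of events in this hypothesis can be laid out as a toy logical model. The sketch below is a deliberately coarse, all-or-none reading of the proposal, written only to make the order of steps explicit. The source describes synergy and weak interactions rather than strict dependence, so the boolean logic, the function name, and the dictionary keys are illustrative assumptions.

```python
from typing import Dict

# Binding-site order, 5' to 3', as described in the text for the rat
# beta-casein promoter (the STAT5 site lies about 90 bp 5' to the start site).
SITE_ORDER_5_TO_3 = ["GRE half-site", "C/EBP", "YY1", "STAT5"]


def proposed_induction(glucocorticoid: bool,
                       prolactin: bool,
                       protein_synthesis: bool = True) -> Dict[str, bool]:
    """Trace the hypothesized steps as simple true/false states."""
    # Step 1: glucocorticoid changes the relative amounts of C/EBP isoforms.
    # This needs new protein synthesis, hence the lag and the block by
    # cycloheximide.
    cebp_isoform_shift = glucocorticoid and protein_synthesis
    # Step 2: GR and C/EBP occupy their adjacent suboptimal sites together.
    gr_cebp_complex = cebp_isoform_shift
    # Step 3: the GR-C/EBP interaction helps displace the repressor YY1.
    yy1_displaced = gr_cebp_complex
    # Step 4: prolactin-activated STAT5 binds once YY1 has left.
    stat5_bound = prolactin and yy1_displaced
    return {
        "cebp_isoform_shift": cebp_isoform_shift,
        "gr_cebp_complex": gr_cebp_complex,
        "yy1_displaced": yy1_displaced,
        "stat5_bound": stat5_bound,
    }


if __name__ == "__main__":
    print(proposed_induction(glucocorticoid=True, prolactin=True))
    print(proposed_induction(glucocorticoid=True, prolactin=True,
                             protein_synthesis=False))
```

In this reading, blocking protein synthesis stops the chain at the first step, which mirrors the cycloheximide sensitivity noted earlier for beta-casein induction.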

In addition to hormones, interactions between mammary epithelial cells and extracellular matrix play an important 
role in the regulation of milk-protein gene expression (47). In the bovine beta-casein gene an upstream region 
located between -1562 and -1613 bp 5' to the transcription start site, designated BCE1, was shown to confer both 
hormonal and cell-substratum regulation to reporter genes (25). The proximal region of the bovine beta-casein gene 
does not appear sufficient to elicit this response, and contains several crucial base changes that may disrupt 
important interactions between transcription factors and DNA (Figure 3). An analysis of the BCE1 region shows an 
arrangement of putative GRE half-site, C/EBP, and STAT5 binding sites that is similar to that observed in the proximal 
beta-casein promoter in rats (Figure 3). The rabbit gene encoding alphas1-casein also contains both proximal and 
distal regions that are required for appropriate hormonal regulation, and the upstream regulatory region between 
-3300 and -3410 bp contains a cluster of C/EBP and STAT5 binding sites (48). Thus, it appears that several 
composite response elements may exist in the flanking regions of the milk-protein genes. 

Finally, concentrations of C/EBPalpha in rat hepatocytes were shown to be regulated by interactions between the 
cell and the extracellular matrix (49). Thus, it is intriguing to speculate that a similar regulation occurs in mammary 
epithelial cells and is important for the appropriate regulation of casein gene expression by both hormones and cell- 
substratum interactions. Furthermore, the association of laminin in the extracellular matrix with beta1-integrin was 
shown to result in increased tyrosine phosphorylation in mammary epithelial cells (47), providing a potential 
mechanism by which the cell-substratum and prolactin signal-transduction pathways may interact. 

TARGETING THE EXPRESSION OF BIOLOGICALLY ACTIVE HUMAN IGF-I TO THE MAMMARY GLAND 

Insulin-like growth factor I may play an important role in the regulation of mammary gland development, lactation, 
and tumor progression, as well as influence development of the neonate (50, 51). The human gene encoding IGF-I 
consists of six exons, spans 80 kb of DNA, and is expressed predominantly in liver cells (52, 53). Because of the 
large size of the human IGF-I gene, and taking into account the principles of transgene architecture discussed 
above, we developed a novel approach for targeting the expression of biologically active human IGF-I to the 
mammary gland. Using a polymerase chain reaction-mediated technique with splicing by overlap extension we 
replaced the exons within the 3-kb rat WAP gene with DNA fragments derived from human IGF-Ia cDNA (17). The 
resulting transgene, or a modification thereof, was used to target IGF-I expression to the mammary glands of 
lactating mice. 

However, because of the presence of nonoptimal splice sites within the WAP introns, incorrect splicing was 
detected initially in the WAP-IGF-I transcripts. Site-directed mutagenesis was therefore used to modify the splice 
junctions of the excluded exons to generate modified transgenes that produced correctly spliced transcripts and the 
correct open reading frames to encode pre-pro-IGF-I and pre-pro-des(1-3)IGF-I, an IGF-I analog with a markedly 
decreased affinity for the IGF-binding proteins. With these WAP-based constructs, we detected IGF-I mRNA 
expression in 100% of the lines analyzed, but, interestingly, expression was not dependent on copy number. 
Steady-state concentrations of IGF-I mRNA ranged from 0.2% to 13% of the endogenous mouse WAP mRNA 
concentration, but maximally were 24-fold greater than the concentration of endogenous IGF-I mRNA expressed in 
the liver of nontransgenic mice. Most important, the IGF-I was secreted into the whey fraction of milk from 
transgenic mice and it was biologically active. Concentrations of IGF-I in the whey fraction as high as 55 and 670 
mg/L were detected through use of radioimmunoassays and bioassays, respectively. These mice are a valuable 
model for studying the role of IGF-I and IGF-binding proteins in the regulation of mammary gland and neonatal 
development, lactation, and mammary carcinogenesis. These experiments illustrate the feasibility of targeting the 
expression of growth factors to the mammary gland, resulting in their increased secretion into milk. This approach 
could be used to produce growth factors that may provide important supplements to infant formula. 
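
To put the whey concentrations in more familiar units, the short calculation below converts them to molar terms and compares the two assays. It is a sketch under one outside assumption: the molecular mass of mature human IGF-I is taken as about 7649 g/mol, a value that does not appear in the text, and the des(1-3) analog would be slightly lighter. The function name is hypothetical.

```python
# Approximate molecular mass of mature human IGF-I (assumption, not from
# the text).
IGF1_MOLAR_MASS_G_PER_MOL = 7649.0


def mg_per_l_to_micromolar(mg_per_l: float,
                           molar_mass: float = IGF1_MOLAR_MASS_G_PER_MOL) -> float:
    """Convert a mass concentration in mg/L to micromol/L."""
    grams_per_l = mg_per_l / 1000.0
    return grams_per_l / molar_mass * 1e6


if __name__ == "__main__":
    ria_mg_per_l = 55.0        # radioimmunoassay maximum quoted in the text
    bioassay_mg_per_l = 670.0  # bioassay maximum quoted in the text
    print(f"RIA:      {mg_per_l_to_micromolar(ria_mg_per_l):.1f} micromol/L")
    print(f"Bioassay: {mg_per_l_to_micromolar(bioassay_mg_per_l):.1f} micromol/L")
    print(f"Bioassay / RIA ratio: {bioassay_mg_per_l / ria_mg_per_l:.1f}")
```

The roughly 12-fold gap between the two assays follows directly from the quoted figures; the text does not explain it, and the calculation does not attempt to.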

SUMMARY 

Considerable progress has been made in defining the regulatory elements in the milk-protein genes that are 
important to express heterologous proteins in milk in transgenic livestock. We are starting to understand the 
molecular mechanisms that control the expression of the milk-protein genes. However, additional studies are 
required for defining the elements necessary to ensure efficient transgene expression independent of the site of 
chromosomal integration. Finally, for the commercial potential of the mammary gland as a bioreactor to be realized, 
these studies will have to be coupled with improvements in the techniques of gene transfer into the germline of 
transgenic livestock. 

1 From the Department of Cell Biology, Baylor College of Medicine, Houston. 

2 Supported by the National Cancer Institute (grant CA16303) and the US Department of Agriculture (grant 92- 
37205-8263). 

3 Address reprint requests to JM Rosen, Department of Cell Biology, Baylor College of Medicine, One Baylor 
Plaza, Houston, TX 77030-3498. E-mail: jrosen(at)bcm.tmc.edu 